The role of mentorship in protege performance is a matter of importance to academic, business, and
governmental organizations. While the benefits of mentorship for proteges, mentors and their
organizations are apparent, the extent to which proteges mimic their mentors' career choices and
acquire their mentorship skills is unclear. Here, we investigate one aspect of mentor emulation
by studying mentorship fecundity (the number of proteges a mentor trains) with data from
the Mathematics Genealogy Project, which tracks the mentorship record of thousands of
mathematicians over several centuries. We demonstrate that fecundity among academic mathematicians
is correlated with other measures of academic success. We also find that the average fecundity of
mentors remains stable over 60 years of recorded mentorship. We further uncover three significant
correlations in mentorship fecundity. First, mentors with small mentorship fecundity train proteges
that go on to have a 37% larger than expected mentorship fecundity. Second, in the first third of their
career, mentors with large fecundity train proteges that go on to have a 29% larger than expected
fecundity. Finally, in the last third of their career, mentors with large fecundity train proteges that go
on to have a 31% smaller than expected fecundity.

A large body of literature supports the hypothesis that proteges and mentors benefit from the mentoring
relationship. Proteges that receive career coaching and social support, for instance, are reportedly
more likely to have high performance ratings, earn a higher salary, and receive promotions. In return, mentors
receive fulfillment not only by altruistically improving the welfare of their proteges, but also by improving
their own welfare. Organizations benefit as well, since proteges are more likely to be committed
to their organization and exhibit organizational citizenship behavior. These benefits are not only obtained
through the traditional dyadic mentor-protege relationship, but also through peer relationships that
supplement protege development.

The benefits of mentorship underscore the importance of understanding how mentors were in turn 
trained to foster the development of outstanding mentors. One might suspect that proteges learn manage- 
rial approaches and motivational techniques from their mentors and, as a result, emulate their mentorship 
methodologies; this suggests that outstanding mentors are trained by other outstanding mentors. This
possibility is sometimes formalized as the rising star hypothesis, which postulates that mentors select
up-and-coming proteges based on their perceived ability, potential and past performance, including
promotion history and proactive career behaviors. Rising-star proteges are reportedly more likely to "intend to
mentor", resulting in a "perpetual cycle" of rising-star proteges that emulate their mentors by seeking other
rising stars as their proteges.

However, there is conflicting evidence concerning the rising star hypothesis, so the extent to which
proteges mimic their mentors remains an open question. Indeed, we are unaware of any studies that sys- 
tematically track mentorship success over the entire career of a mentor, so the validity of the rising star 
hypothesis has yet to be fully explored. Here, we investigate whether proteges acquire the mentorship skills 
of their mentors by studying mentorship fecundity — the number of proteges that a mentor trains over the 
course of their career. This measure is advantageous as it directly measures an outcome of the mentor- 
ship process that is relevant to sustained mentorship, allowing us to quantify the degree to which mentor 
fecundity determines protege fecundity. 

Scientific mentorship offers a unique opportunity for studying this question because there is a
structured mentorship environment between advisor and student that is, in principle, readily accessible.
We study a prototypical mentorship network collected from the Mathematics Genealogy Project, which
aggregates the graduation date, mentor, and proteges of 114,666 mathematicians from as early as 1637. 
From this information, we construct a network where links are formed from a mentor to each of his k 
proteges, where k denotes mentorship fecundity. This database is unique because it explicitly tracks the 
career-long mentorship record of a large population of mentors within a single discipline. We focus here on 
the 7,259 mathematicians that graduated between 1900 and 1960 since their mentorship record is the most 
reliable (see Methods). 

Although the mentorship records gathered from the Mathematics Genealogy Project provide the most 
comprehensive data source available for studying academic performance throughout a mathematician's
career, there are obviously other plausible metrics for evaluating academic performance. We have also
compared the mentorship data against a list of publications for 4,447 mathematicians and a list of 269
inductees into the United States' National Academy of Sciences (NAS) (see Methods). We find that
mentorship fecundity is much larger for NAS members than for non-NAS members (Fig. 1a). We further find that
the number of publications is strongly correlated with fecundity, regardless of whether or not a
mathematician is a NAS member (Fig. 1b). These results demonstrate that, although fecundity is not a typical measure
of academic performance, it is closely related to other measures of academic success. Thus, even though 
our investigation concerns how fecundity is correlated between mentor and protege, our results also address 
questions in the academic evaluation literature concerning the success of a mathematician. 

We first investigate whether it is possible to predict the fecundity of a mathematician by modeling 
the fecundity distribution $p(k|t)$ as a function of graduation year $t$. Considering that some mathematicians
remain in academia throughout their career while others spend only a portion of their career in academia, 
one might expect that there are two types of individuals when it comes to academic mentorship fecundity,
"haves" and "have-nots", in the sense that these mathematicians have or have not had the opportunity
to mentor students throughout their career. If each mentor chooses to train a new academic protege with
a probability that depends on whether they are a "have" or a "have-not", and stops training academic
proteges otherwise, then we would expect that the resulting fecundity distribution is a mixture
of two discrete exponential distributions

$$p(k|\theta) = \pi_h \, p(k|\kappa_h) + (1 - \pi_h) \, p(k|\kappa_{hn}), \qquad (1)$$

where $\pi_h$ is the probability that a mathematician is a "have", and $p(k|\kappa_h)$ and $p(k|\kappa_{hn})$ are discrete
exponential distributions $p(k|\kappa) = e^{-k/\kappa}\,(1 - e^{-1/\kappa})$ with average fecundity $\kappa_h$ and $\kappa_{hn}$
for "haves" and "have-nots" respectively. We estimate the parameters $\theta = \{\pi_h, \kappa_h, \kappa_{hn}\}$ of this
distribution from the empirical data using expectation-maximization. Using Monte Carlo hypothesis testing
(see Methods), we have found that Eq. (1) cannot be rejected as a candidate description of the fecundity
distribution $p(k|t)$. For an alternative description of the fecundity distribution $p(k|t)$, see Supplementary
Discussion and Fig. S1.
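
The expectation-maximization fit of the two-component mixture can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, starting values and iteration count are hypothetical, and $\kappa$ is treated as the scale parameter of the discrete exponential as written in the text, so that the weighted maximum-likelihood update is $\kappa = 1/\ln(1 + 1/\bar{k})$ for a weighted sample mean $\bar{k}$.

```python
import math
import random
from collections import Counter

def discrete_exp_pmf(k, kappa):
    """p(k | kappa) = exp(-k / kappa) * (1 - exp(-1 / kappa)) for k = 0, 1, 2, ..."""
    return math.exp(-k / kappa) * (1.0 - math.exp(-1.0 / kappa))

def kappa_from_mean(mean_k):
    """Maximum-likelihood scale given a (weighted) sample mean."""
    mean_k = max(mean_k, 1e-12)  # guard against a zero mean
    return 1.0 / math.log(1.0 + 1.0 / mean_k)

def sample_discrete_exp(kappa, rng):
    """Draw one variate by inverting the survival function exp(-k / kappa)."""
    return int(-kappa * math.log(1.0 - rng.random()))

def fit_mixture_em(ks, pi_h=0.5, kappa_h=5.0, kappa_hn=0.5, n_iter=300):
    """Fit pi_h, kappa_h, kappa_hn of Eq. (1); starting values are assumptions."""
    counts = Counter(ks)  # work on distinct fecundities for speed
    n = len(ks)
    for _ in range(n_iter):
        w_h = s_h = s_hn = 0.0
        for k, c in counts.items():
            # E-step: posterior probability that fecundity k came from a "have"
            a = pi_h * discrete_exp_pmf(k, kappa_h)
            b = (1.0 - pi_h) * discrete_exp_pmf(k, kappa_hn)
            r = a / (a + b)
            w_h += c * r
            s_h += c * r * k
            s_hn += c * (1.0 - r) * k
        # M-step: weighted maximum-likelihood updates
        pi_h = w_h / n
        kappa_h = kappa_from_mean(s_h / w_h)
        kappa_hn = kappa_from_mean(s_hn / (n - w_h))
    return pi_h, kappa_h, kappa_hn
```

In practice one would run such a fit separately for each graduation year $t$, since the text reports that $\pi_h$ changes over time.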

As one might expect, the probability $\pi_h$ that an individual is a "have" experiences dramatic changes
over time due to historical events, such as two World Wars, the beginning of the Cold War, and considerable
increases in academic funding (Fig. 2b). In contrast, the average fecundities of "haves" and "have-nots" do
not exhibit systematic historical changes ($\kappa_h = 9.8 \pm 0.4$ and $\kappa_{hn} = 0.47 \pm 0.03$), suggesting that these
quantities offer fundamental insight into the mentorship process among mathematicians (Fig. 2c-d).

The stationarity of $\kappa_h$ and $\kappa_{hn}$ also provides a simple heuristic for classifying an individual as a
"have" or a "have-not"; by maximum likelihood, an individual is a "have" if $k > 2$ and a "have-not"
otherwise. These results raise the possibility that similar features, perhaps with different characteristic 
scales of fecundity, may be present in other mentorship domains. 

While our description of the fecundity distribution has highlighted a fundamental property of mentorship
among mathematicians, it is not predictive of the behavior of individual mathematicians in the sense
that fecundity, according to this model, is a random variable drawn from the distribution of Eq. (1). We next
test whether proteges mimic the mentorship fecundity of their mentors by comparing protege fecundity with 
a suitable null model that does not introduce correlations in fecundity. In analogy with ancestral genealo- 
gies and their notion of parents giving birth to children, networks generated from uncorrelated branching 
processes offer a useful and appropriate context for studying the mathematician genealogy network. Here, a 
graduation date is equivalent to a birth date and mentors and proteges are equivalent to parents and children, 
respectively. We will consequently use the subscripts p and c when it is necessary to make generational 
statements relating parents and children. 

In a branching process, a parent $p$, born at time $t_p$, has $k_p$ children. A child $c$ of parent $p$ is born at
time $t_c$ and subsequently has $k_c$ children. The fecundity $k$ of each individual is drawn from the conditional
fecundity distribution $p(k|t)$ for an individual born at time $t$. Networks generated from this type of branching
process are therefore defined by the birth date of each individual $t$, the fecundity distribution $p(k|t)$, and the
chronology of child births $\{t_c\}$ for each parent (Fig. 3a).

We compare the mathematician genealogy network with two ensembles of randomized genealogies 
from the branching process family. Random networks from Ensemble I retain the birth date of each individual
$t$, the fecundity $k$ of each individual, and the chronology of child births $\{t_c\}$ for each parent (Fig. 3b).
Random networks from Ensemble II additionally restrict parent-child pairs to have the same age difference
$(t_c - t_p)$ as parent-child pairs in the empirical network (Fig. 3c). All other attributes of these networks
are randomized using a link switching algorithm (see Methods), so neither of these random network
ensembles introduces correlations between parent fecundity and child fecundity or temporal correlations in 
fecundity, providing a suitable basis for comparison with the mathematician genealogy network. 

To explore the influence of mentor fecundity and age difference on protege fecundity, we partition 
proteges according to the fecundity of their mentors and the age difference between mentor and protege
$(t_c - t_p)$. Given our findings (see Supplementary Discussion, Figs. S2-S3), it is clear that age differences
impact fecundity in a non-random manner for proteges whose mentors have $k_p < 3$. We partition the
remaining proteges whose mentors have $k_p \ge 3$ into two groups: proteges whose mentors are
below-average "haves" ($3 \le k_p < 10$) and proteges whose mentors are above-average "haves" ($k_p > 10$). We
then partition these three groups of proteges according to when they graduated during their mentors' career.
Specifically, we split each group of proteges into terciles, the most fine-grained grouping that still gives us
sufficient power to examine the statistical significance of any differences between the empirical data and the 
null models. 
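
The partitioning described above can be sketched as follows. This is an illustration only: the record layout `(k_p, t_p, t_c, k_c)`, the class labels and the function names are hypothetical, mentors with exactly $k_p = 10$ are assigned to the above-average class as an assumption, and terciles are taken over the age difference $t_c - t_p$ within each class, with ties broken by sort order.

```python
from collections import defaultdict

def mentor_class(k_p):
    """Assign a protege to a class based on the fecundity k_p of the mentor."""
    if k_p < 3:
        return "low"
    if k_p < 10:
        return "below_average_have"
    return "above_average_have"  # assumption: k_p == 10 is placed here

def partition_proteges(records):
    """Group proteges by mentor class, then split each class into terciles.

    records: iterable of (k_p, t_p, t_c, k_c) tuples (hypothetical layout).
    Returns {class: {"early" | "middle" | "late": [(t_c - t_p, k_c), ...]}}.
    """
    groups = defaultdict(list)
    for k_p, t_p, t_c, k_c in records:
        groups[mentor_class(k_p)].append((t_c - t_p, k_c))
    result = {}
    for name, members in groups.items():
        members.sort(key=lambda m: m[0])  # order by age difference
        n = len(members)
        cuts = [0, n // 3, (2 * n) // 3, n]
        result[name] = {
            label: members[cuts[i]:cuts[i + 1]]
            for i, label in enumerate(("early", "middle", "late"))
        }
    return result
```

The average child fecundity of each tercile can then be compared with the same quantity computed in the randomized networks.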

We use the partitioning of children into classes to examine the relationship between the average child
fecundity $\langle k_c \rangle$ and the age difference $t_c - t_p$ between parent and child (Figs. 4a,b and S4a,b). If the data
are consistent with a branching process, then we would expect the average child fecundity $\langle k_c \rangle$ to exhibit no
temporal dependence. However, the regressions between the average child fecundity z-score (see Methods)
and the age difference between parent and child $t_c - t_p$ deviate significantly (Figs. 4c and S4c) from this
expectation for both random ensembles, revealing three distinct features. First, mentors with $k_p < 3$ train
proteges that go on to have a 37% larger than expected mentorship fecundity throughout their career. Second,
in the first third of their career, mentors with $k_p > 10$ train proteges that go on to have a 29% larger than
expected fecundity. Finally, in the last third of their career, mentors with $k_p > 10$ train proteges that go on
to have a 31% smaller than expected fecundity.

The fact that mentors with $k_p < 3$ train proteges with larger than expected fecundity throughout their
career is somewhat counter-intuitive. According to the rising star hypothesis, one might have expected
that proteges trained by mentors with $k_p < 3$ are likely to mimic their mentors and therefore have smaller
than expected fecundity. Our results demonstrate that this is not the case. One possible explanation is that
mentors with $k_p < 3$ are more aware of the resources they must allocate for effective mentorship, leading to
a more enriching mentorship experience for their proteges. An alternative hypothesis is that mentors with
$k_p < 3$ select for, or are selected by, proteges that have a greater aptitude for mentorship.

The striking temporal correlations for mentors with $k_p > 10$ are intriguing as well. Since mentors
with $k_p > 10$ represent the upper echelon of mentors in mathematics, these mentors are likely "rising stars"
early in their academic career. The fact that these mentors train proteges with large fecundity early in their 
career supports the rising star hypothesis. 

By the end of these mentors' careers, however, their proteges have smaller than expected fecundity.
Perhaps mentors who ultimately have large fecundity spend fewer and fewer resources training each of their
proteges as their career progresses. Alternatively, proteges with large mentorship fecundity aspirations might
court prolific mentors early in their mentor's career, whereas proteges with small fecundity aspirations might
court prolific mentors later in their mentor's career. Our findings therefore reveal interesting nuances to the
rising star hypothesis. 

It is unclear whether the temporal correlations we uncover in mentorship fecundity might generalize 
beyond mathematicians in academia. Anecdotally, mathematicians are thought to perform their best work at
a young age, a perception that may influence how mentors and proteges choose each other. Perceptions
in other domains, however, may differ and subsequently influence mentor and protege selection in different
ways. As data for other academic disciplines, business and the government become available, it will be
important to determine whether temporal correlations in fecundity are a general consequence of mentorship, 
or a particular consequence of mentorship for mathematicians in academia. 

Regardless, our results offer another means of judging academic impact in science as well as the im- 
pact of managers on their employees, both of which are notoriously complicated and risky affairs. These as- 
sessments are multi-dimensional, metrics and expectations are domain dependent, and placement of creative 
output, time-scales of impact and recognition vary significantly from field to field. Ultimately, assessment
of individuals for awards and promotion is based on painstaking individual analysis by selection committees
and peers. While these committees may have varying goals and incentives, it is important that collective 
arguments — the kind of arguments we are making here — be based on sound quantitative analysis. Although 
the extent to which our findings extrapolate to other domains may vary, we are confident that the kind of 
analysis presented here will serve to elevate the discourse on scientific and managerial impact. 

Methods summary 

Data acquisition. We use data from the Mathematics Genealogy Project to identify the 7,259 protege
mathematicians that are in the giant component and graduated between 1900 and 1960, of which 4,447
have linked publication records through MathSciNet. We use a text matching algorithm to
semi-automatically match members of the National Academy of Sciences with mathematicians from the
Mathematics Genealogy Project.

Monte Carlo hypothesis testing for $p(k|t)$. We use Monte Carlo hypothesis testing to determine
whether Eq. (1) with maximum-likelihood parameters $\theta$ can be rejected as a candidate model for $p(k|t)$
at the $\alpha = 0.05$ significance level.

Random network generation. We use a variation of the Markov chain Monte Carlo algorithm to
construct each of the 1,000 random networks in Ensembles I and II. Specifically, we restrict the switching
of endpoints to links $p \to c$ that belong to the same link class $\mathcal{C}$, where the link classes are defined as
$\mathcal{C}_{I}(t) = \{p \to c \mid t_c = t\}$ and $\mathcal{C}_{II}(s,t) = \{p \to c \mid t_p = s, t_c = t\}$ for networks from Ensembles I and II,
respectively. Each link class $\mathcal{C}$ can be thought of as a subgraph, which can then be randomized in the usual
way by attempting 100 switches per link in each link class.

Average fecundity z-score. By the central limit theorem, the average of variates drawn from $p(k_c|t_c)$ is
normally distributed since $p(k_c|t_c)$ is well-described by a mixture of discrete exponential distributions, a
distribution with finite variance. Given a set of child fecundities $K_c = \{k_c\}$, we quantify how significantly
a subset of these child fecundities $K^* \subset K_c$ deviates from $K_c$ by measuring the z-score of the average
child fecundity $\langle k_c \rangle$ of all nodes within the subset $K^*$ compared with the average child fecundity $\langle k_c \rangle_s$
computed from children within an equivalent subset $K^*$ in the synthetic networks. That is, we compute
$z = (\langle k_c \rangle - \mu)/\sigma$, where $\mu$ is the ensemble average of $\{\langle k_c \rangle_s\}$ and $\sigma$ is the standard deviation of the
ensemble $\{\langle k_c \rangle_s\}$ over the 1,000 realizations generated for our null models.

Methods

Mathematics Genealogy Project data 

We study a prototypical mentorship network collected from the Mathematics Genealogy Project which 
aggregates the graduation date, mentor, and advisees of 114,666 mathematicians from as early as 1637. 
From this information, we construct a mathematician genealogy network where links are formed from a 
mentor to each of his k proteges. 

The data collected by the Mathematics Genealogy Project are self-reported, so there is no guarantee 
that the observed genealogy network is a complete description of the mentorship network. In fact, 16,147 
mathematicians do not have a recorded mentor and, of these, 8,336 do not have any recorded proteges. To 
avoid having these mathematicians distort our analysis, we restrict our analysis to the 90,211 mathematicians
that comprise the giant component of the network; that is, we restrict our analysis to the largest set of
connected mathematicians in the mathematician genealogy network. 

Although the Mathematics Genealogy Project contains information on mathematicians from as early 
as 1637, this does not necessarily indicate that all of these records are representative of the evolution of the 
network. For example, prior to 1900, the Project records fewer than 52 new graduates per year worldwide. 
Furthermore, since mathematicians oftentimes have mentorship careers lasting 50 years or more (Fig. ??), 
we are not guaranteed to have complete mentorship records for mathematicians that graduated after 1960. 
We therefore restrict our analysis to the 7,259 protege mathematicians that graduated between 1900 and 
1960, for whom we believe that the graduation and mentorship record is the most reliable.

MathSciNet data

Of the 7,259 protege mathematicians that graduated between 1900 and 1960, 4,447 have linked
MathSciNet publication records, which are used in our analysis.

U.S. National Academy of Sciences data

The United States' National Academy of Sciences maintains two databases of its membership. The first
database consists of all deceased members elected to the Academy from as early as 1863. This database 
records the name of the inductee, their election year, their date of death, and a link to a biographical sketch. 
The second database consists of all active members of the Academy. This database records the name of the 
inductee, their institution, their academic field, and their election year. 

The challenge in matching these data with the Mathematics Genealogy Project data is that there is
no direct link between a member of the National Academy and a Mathematics Genealogy Project page, or
vice versa. This ambiguity is compounded by the fact that some members of the Academy have
common names. To circumvent these problems, we use a text matching algorithm to semi-automatically
detect if a member of the Academy matches a name in the Mathematics Genealogy Project database. We 
use this procedure to curate the 269 members of the Academy that definitively match mathematicians in the 
Mathematics Genealogy Project database. 

Monte Carlo hypothesis testing for $p(k|t)$

Given a model $\mathcal{M}$ with parameters $\theta_t$ for the empirically observed fecundity distribution $p(k|t)$, we use
Monte Carlo hypothesis testing to determine whether the model $\mathcal{M}$ can be rejected as a candidate model
for $p(k|t)$. The Monte Carlo hypothesis testing procedure is as follows. First, we calculate the
best-estimate parameters $\hat{\theta}_t$ for model $\mathcal{M}$ at time $t$ using maximum likelihood estimation. Second, we compute
the test statistic $S$ (detailed below) between the model $\mathcal{M}(\hat{\theta}_t)$ and the empirical fecundity distribution
$p(k|t)$. We next generate a synthetic fecundity distribution $p_s(k)$ from model $\mathcal{M}$ using the best-estimate
parameters $\hat{\theta}_t$, and we treat the synthetic data exactly the same as we treated the empirical data: first, we
calculate the best-estimate parameters $\hat{\theta}_s$ for model $\mathcal{M}$ from maximum likelihood estimation; second, we
compute the test statistic $S_s$ between the model $\mathcal{M}(\hat{\theta}_s)$ and the synthetic fecundity distribution $p_s(k)$. We
generate synthetic fecundity distributions $p_s(k)$ and their corresponding synthetic test statistics $S_s$ until we
accumulate an ensemble of 1,000 Monte Carlo test statistics $\{S_s\}$. Finally, we calculate a two-tailed
P-value with a precision of 0.001. As is customary in hypothesis testing, we reject the model $\mathcal{M}$ at time $t$ if
the P-value is less than a threshold value. We select a P-value threshold of 0.05; that is, if fewer than 5% of
the synthetic data sets exhibit deviations in the test statistic that are larger than those observed empirically,
the model is rejected at time $t$.

Since we are conducting hypothesis tests with the fecundity distribution $p(k|t)$, a distribution with
discrete support, it is important to use a test statistic $S$ that is appropriate for testing discrete distributions.
We compute the test statistic after binning $p(k|t)$ such that each bin has at least one expected observation
according to the model $\mathcal{M}(\hat{\theta}_t)$. This binning prevents observations that are exceptionally rare from
dominating our statistical test and skewing our results.
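
The procedure above can be sketched in a few lines. This is a simplified illustration under several assumptions, not the authors' code: the model is a single discrete exponential rather than the mixture of Eq. (1), the test statistic is a Pearson chi-square computed on bins with at least one expected observation (the choice of statistic is an assumption), the P-value is the one-sided fraction of synthetic statistics at least as large as the empirical one, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def pmf(k, kappa):
    """Discrete exponential p(k | kappa) for k = 0, 1, 2, ..."""
    return math.exp(-k / kappa) * (1.0 - math.exp(-1.0 / kappa))

def fit_kappa(ks):
    """Maximum-likelihood scale for a single discrete exponential."""
    mean_k = max(sum(ks) / len(ks), 1e-12)  # guard against an all-zero sample
    return 1.0 / math.log(1.0 + 1.0 / mean_k)

def sample(kappa, n, rng):
    """Draw n variates by inverting the survival function exp(-k / kappa)."""
    return [int(-kappa * math.log(1.0 - rng.random())) for _ in range(n)]

def binned_chi_square(ks, kappa):
    """Pearson chi-square with every bin holding >= 1 expected observation."""
    n = len(ks)
    counts = Counter(ks)
    bins = []  # each entry is [observed, expected]
    k = 0
    while n * pmf(k, kappa) >= 1.0:
        bins.append([counts.get(k, 0), n * pmf(k, kappa)])
        k += 1
    # Everything at or beyond k forms the tail; P(K >= k) = exp(-k / kappa).
    tail = [sum(c for v, c in counts.items() if v >= k), n * math.exp(-k / kappa)]
    if bins and tail[1] < 1.0:
        bins[-1][0] += tail[0]  # merge a too-small tail into the last bin
        bins[-1][1] += tail[1]
    else:
        bins.append(tail)
    return sum((o - e) ** 2 / e for o, e in bins if e > 0)

def monte_carlo_p_value(ks, n_synthetic=1000, seed=0):
    """Fraction of synthetic statistics at least as large as the empirical one."""
    rng = random.Random(seed)
    kappa_hat = fit_kappa(ks)
    s_emp = binned_chi_square(ks, kappa_hat)
    exceed = 0
    for _ in range(n_synthetic):
        synth = sample(kappa_hat, len(ks), rng)
        # Treat synthetic data exactly like the empirical data: refit, then test.
        if binned_chi_square(synth, fit_kappa(synth)) >= s_emp:
            exceed += 1
    return exceed / n_synthetic
```

With 1,000 synthetic data sets the returned value has a resolution of 0.001, matching the precision quoted in the text; the model would be rejected when the value falls below 0.05.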

Random network generation 

We use a Markov chain Monte Carlo algorithm to build random networks from the mathematician
genealogy network. The standard version of this algorithm inherently preserves the fecundity of each
individual, but it does not preserve the chronology of child births $\{t_c\}$ for each parent. To obtain random
networks belonging to Ensemble I or Ensemble II, we restrict the switching of endpoints to links $p \to c$
that belong to the same link class $\mathcal{C}$, where the link classes are defined as $\mathcal{C}_{I}(t) = \{p \to c \mid t_c = t\}$ and
$\mathcal{C}_{II}(s,t) = \{p \to c \mid t_p = s, t_c = t\}$ for networks from Ensembles I and II, respectively. Each link class
$\mathcal{C}$ can be thought of as a subgraph, which can then be randomized using the Markov chain Monte Carlo
algorithm. Here, we attempt 100 switches per link in each link class $\mathcal{C}$, which sufficiently alters random
networks away from the original empirical network. We repeat this procedure 1,000 times to generate
a set of 1,000 random networks for each ensemble.
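
A minimal sketch of class-restricted link switching is given below. It is an illustration under assumptions rather than the authors' implementation: links are `(parent, child)` tuples, `birth` is a hypothetical dictionary of graduation years, and a switch that would create a duplicate link or a self-link is simply skipped and still counts as an attempt.

```python
import random
from collections import defaultdict

def randomize_links(links, birth, ensemble="I", switches_per_link=100, seed=0):
    """Swap the children of two links drawn from the same link class.

    Ensemble I groups links by t_c; Ensemble II groups them by (t_p, t_c).
    Each parent keeps its fecundity and its chronology of child births.
    """
    rng = random.Random(seed)
    links = list(links)
    existing = set(links)
    classes = defaultdict(list)  # link class -> indices into links
    for i, (p, c) in enumerate(links):
        key = birth[c] if ensemble == "I" else (birth[p], birth[c])
        classes[key].append(i)
    for members in classes.values():
        if len(members) < 2:
            continue  # nothing to switch within a single-link class
        for _ in range(switches_per_link * len(members)):
            i, j = rng.sample(members, 2)
            (p1, c1), (p2, c2) = links[i], links[j]
            new1, new2 = (p1, c2), (p2, c1)
            # Skip moves that create self-links or duplicate links.
            if p1 == c2 or p2 == c1 or new1 in existing or new2 in existing:
                continue
            existing.difference_update((links[i], links[j]))
            existing.update((new1, new2))
            links[i], links[j] = new1, new2
    return links
```

Calling this function with 1,000 different seeds would yield one ensemble of randomized networks; because both swapped children share the same $t_c$, every parent retains the same set of child birth dates.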

Average fecundity z-score 

The average of variates drawn from $p(k_c|t_c)$ is normally distributed since $p(k_c|t_c)$ is well-described by a
mixture of discrete exponential distributions, a distribution with finite variance, and thus the central limit
theorem applies. Given a set of child fecundities $K_c = \{k_c\}$, we quantify how significantly a subset of
these child fecundities $K^* \subset K_c$ deviates from $K_c$ by measuring the z-score of the average child fecundity
$\langle k_c \rangle$ of all nodes within the subset $K^*$ compared with the average child fecundity $\langle k_c \rangle_s$ computed from
children within an equivalent subset $K^*$ in the synthetic networks. That is, we compute $z = (\langle k_c \rangle - \mu)/\sigma$,
where $\mu$ is the ensemble average of $\{\langle k_c \rangle_s\}$ and $\sigma$ is the standard deviation of the ensemble $\{\langle k_c \rangle_s\}$ over
the 1,000 realizations generated for our null models.
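
The z-score computation can be written compactly as below. This is a sketch with hypothetical names; it assumes the child fecundities of the subset $K^*$ have already been extracted from the empirical network and from each synthetic network, and it uses the population standard deviation of the ensemble, which differs negligibly from the sample standard deviation for 1,000 realizations.

```python
import statistics

def fecundity_z_score(empirical_subset, synthetic_subsets):
    """z = (<k_c> - mu) / sigma for one subset of child fecundities.

    empirical_subset: child fecundities in K* from the empirical network.
    synthetic_subsets: one list of child fecundities per synthetic network.
    """
    empirical_mean = statistics.fmean(empirical_subset)
    synthetic_means = [statistics.fmean(s) for s in synthetic_subsets]
    mu = statistics.fmean(synthetic_means)      # ensemble average
    sigma = statistics.pstdev(synthetic_means)  # ensemble standard deviation
    return (empirical_mean - mu) / sigma
```

For example, an empirical subset with mean 5 compared against synthetic means of 2 and 4 gives $\mu = 3$, $\sigma = 1$ and therefore $z = 2$.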

Figure 1 Relationship between mentorship fecundity and other performance metrics. a, Cumulative
distribution of the mentorship fecundity for NAS members (red) and non-NAS members
(black). NAS members have an average fecundity of $\langle k \rangle_{\text{NAS}} = 14$, which is far greater than the
average fecundity of non-NAS members $\langle k \rangle_{\text{non-NAS}} = 3.1$, indicating that fecundity is closely related
to academic recognition. Note that not all mathematicians in the non-NAS group were eligible for
NAS membership due to citizenship and other circumstances. This fact makes the result in the
figure all the more striking. b, Average and standard error (symbols and error bars) of the number
of publications as a function of the mentorship fecundity for NAS members (red) and non-NAS
members (black). NAS members have nearly twice as many publications on average as non-NAS
members for all fecundity levels.

Figure 2 Evolution of the fecundity distribution. a-c, Cumulative distribution of the fecundity of
mathematicians that graduated during 1910, 1930, and 1950 (symbols) compared with the
best-estimate predictions of a mixture of two discrete exponentials (lines). Monte Carlo hypothesis
testing confirms that this model cannot be rejected as a model of the fecundity distribution during
every year from 1900-1960, as denoted by the P-values above the $\alpha = 0.05$ significance level
(see Methods). d-f, Best-estimate parameters calculated by maximum likelihood for a mixture of
two discrete exponentials as a function of time. Dashed lines denote average parameter values
between 1900-1960 and colored circles denote the years displayed in panels a-c. The probability
$\pi_h$ of being a "have" changes over time, generally corresponding with historic events (hashed
grey shading). In contrast, the average fecundities $\kappa_h = 9.8 \pm 0.4$ and $\kappa_{hn} = 0.47 \pm 0.03$ remain
stable until 1960, at which point they steadily decrease (grey shaded region), corresponding with
the time at which mentorship records become incomplete (see Methods).

Figure 3 Branching process null models. a, Subset of the mathematician genealogy network.
Mentors/parents (black circles) are connected to each of their proteges/children (white circles).
The horizontal positions of mathematicians represent their graduation/birth date $t$. The bottom
two parents were born in 1924, the top two parents were born in 1937, and all four parents have
a child born in 1958. From the parent's perspective, three essential features of the empirical
network must be preserved in random networks generated from the two branching process null
models: the birth date $t_p$, the fecundity $k_p$, and the chronology of child births $\{t_c\}$. b, Random
networks from Ensemble I preserve these three essential features. Solid lines highlight the links
in the empirical network whose end points can be randomized. Dashed lines illustrate one of
the possible randomization moves after switching the corresponding pair of links. Note that the
age difference between parent and child is not preserved. c, Random networks from Ensemble II
preserve these three essential features as well as the age difference between parent and child.
Solid lines of the same color highlight the links in the empirical network whose end points can be
randomized. Dashed lines illustrate one of the possible randomization moves after switching the
corresponding pair of links. Random networks for each ensemble are generated by attempting
100 switches per link (see Methods).

Figure 4 Effect of age difference $t_c - t_p$ between mentor and protege on protege fecundity. a,
Fecundity distribution of children born during the 1910s from parents with $k_p < 3$, $3 \le k_p < 10$,
and $k_p > 10$ compared with the expectation from Ensemble I (grey line). We separate children
into terciles (early, middle, late) according to the time difference in birth dates $t_c - t_p$ between
parents and children. Note that the average fecundity of children born from parents with $k_p < 3$
is larger than expected, regardless of whether they were born during the early, middle, or later
part of their parent's life. Note also that the average fecundity of children born from parents with
$k_p > 10$ decreases throughout their parent's life. b, We quantify the significance of these trends
during each decade (colored symbols) by computing the z-score of the average child fecundity
$\langle k_c \rangle$ compared with the average child fecundity in networks from Ensemble I. This information
is summarized by identifying the linear regression (solid black line). Note that the regression
lines for networks from our null model (grey lines) vary around the expectation of our null model
(dashed black line). c, Significance of linear regressions in panel b. We compare the slope and
intercept of the empirical regression (black circle) with the distribution of the slope and intercept
of the same quantities computed from the null model. Since these quantities are approximately
distributed as a multivariate Gaussian, we compute the equivalent of a two-tailed P-value by finding
the fraction of synthetically generated slope-intercept pairs that lie outside of the equi-probability
surface of the multivariate Gaussian (dashed ellipse). Note that the slope and intercept of the
regression for children from parents with small ($p = 0.009$) and large fecundity ($p < 0.001$) are
significantly different from the expectation for the null model, consistent with the data displayed in
panel a. Comparisons with expectations from random networks from Ensemble II yield the same
conclusions (Fig. S4).